ABSTRACT

Inferring magnetic and thermodynamic information from spectropolarimetric observations relies on 
the assumption of a parameterized model atmosphere whose parameters are tuned by comparison 
with observations. Often, the choice of the underlying atmospheric model is based on subjective 
reasons. In other cases, complex models are chosen based on objective reasons (for instance, the 
necessity to explain asymmetries in the Stokes profiles) but it is not clear what degree of complexity 
is needed. The lack of an objective way of comparing models has, sometimes, led to opposing views 
of the solar magnetism because the inferred physical scenarios are essentially different. We present 
the first quantitative model comparison based on the computation of the Bayesian evidence ratios for 
spectropolarimetric observations. Our results show that there is not a single model appropriate for 
all profiles simultaneously. Data with moderate signal-to-noise ratios favor models without gradients 
along the line-of-sight. If the observations show clear circular and linear polarization signals above 
the noise level, models with gradients along the line are preferred. As a general rule, observations 
with large signal-to-noise ratios favor more complex models. We demonstrate that the evidence ratios 
correlate well with simple proxies. Therefore, we propose to calculate these proxies when carrying out 
standard least-squares inversions to allow for model comparison in the future. 

Subject headings: methods: data analysis, statistical — techniques: polarimetric — Sun: photosphere 



1. INTRODUCTION 

Spectropolarimetry is a very powerful diagnostic tech- 
nique. It has allowed us to study in depth the thermody- 
namical and magnetic properties of the solar and stellar 
plasmas. However, the valuable information encoded in 
the Stokes profiles is often difficult to extract. The re- 
sponse of the spectral shape of a given spectral line to 
changes of the properties of the plasma is very convoluted, 
non-linear and, on many occasions, non-local. 

In spite of the complex relation between the physi- 
cal parameters and the emergent Stokes profiles, sev- 
eral simple diagnostic tools have been developed in 
the past and are still in wide use in solar physics. 
Among them, the line ratio technique (Stenflo 1973, 2010, 2011), 
the center-of-gravity method (Semel 1970; Rees & Semel 1979) 
and the application of calibration curves between spectral line 
and magnetic field properties (e.g., Lites et al. 2008; 
Martínez Pillet et al. 2011 for recent applications) have had 
special relevance. 

During the last few decades we have witnessed the 
development and systematic application of nonlinear 
inversion codes. They extract physically relevant in- 
formation by comparing the observed Stokes profiles 
to those synthesized in appropriate atmospheric models. 
Because of the non-linearity between the physical parameters and 
the observables, these inversion methods make use of elaborate, 
time-consuming non-linear optimization methods. The first heroic 
(because of the low computational power available at that time) 
efforts made use of relatively simple physical models, of which 
the Milne-Eddington (ME) approximation is the most widespread 
(e.g., Harvey et al. 1972; Auer et al. 1977; Landi Degl'Innocenti 
& Landolfi 2004). Although the simplifying assumptions made by 
the ME approximation may not be completely fulfilled in real solar 
plasmas, it is still one of the most widely used models, in part 
because of its analytical simplicity. Even state-of-the-art 
inversion codes such as VFISV (Borrero et al. 2007, 2010), used 
for inferring magnetic field vectors from the Helioseismic and 
Magnetic Imager (HMI; onboard the Solar Dynamics Observatory) 
data, or MILOS (Orozco Suárez et al. 2007) and MERLIN (Skumanich 
& Lites 1987; Lites et al. 2007), currently applied to data from 
the Hinode spacecraft, are based on this assumption. 

The increase in computational power made it feasi- 
ble to use more elaborate models. A fundamental leap 
forward was the application of the idea of response 
functions (Landi Degl'Innocenti & Landi Degl'Innocenti 1977) to 
the inversion of Stokes profiles with non-trivial depth 
stratifications of the physical quantities. The first 
representative of this family of codes was SIR (Stokes Inversion 
based on Response functions; Ruiz Cobo & del Toro Iniesta 1992). 
The presence of gradients of the physical properties along the 
line-of-sight (LOS) is important for explaining the strong 
asymmetries observed in magnetized regions (Solanki & Pahlke 1988; 
Grossmann-Doerth et al.; Solanki & Montavon 1993; Sigwarth et al. 
1999; Khomenko et al.; Martínez González et al. 2008; Viticchié & 
Sánchez Almeida 2011). Based on the same strategy, Socas-Navarro 
et al. (1998) developed a code capable of dealing with lines in 
NLTE (non-local thermodynamic equilibrium). This code has been 
mainly applied to the inversion of the Ca II infrared triplet 
lines, which are formed under strong NLTE conditions (e.g., 
Socas-Navarro et al. 2000; de la Cruz Rodríguez et al.). 



Models based on the assumption of microstructured 
magnetic atmospheres which incorporate small-scale 
fluctuations of the physical quantities along the LOS 
have also been proposed (Sánchez Almeida 1997). A 
property of such models is the natural generation 
of asymmetric profiles, which has been proposed as 
an alternative explanation of the observed asymmetries 
(Sánchez Almeida et al. 1989; Sánchez Almeida & Lites 
2000; Viticchié et al. 2011). 

The history of inversion codes displays an interesting 
characteristic. The complexity of the models proposed 
has increased with time thanks to two factors. First, the 
computational power has allowed us to solve the non- 
linear optimization problem faster. Second, the quality 
of spectropolarimetric observations has improved with 
time, with instruments systematically achieving signal- 
to-noise ratios above 1000 and reaching 10^4 in many 
cases. This forces the inversion code to fit minute varia- 
tions of the shape of the Stokes profile. In many cases, 
such small modifications of the shape of the profiles are 
encoding important physical effects. 

Although counterintuitive, the availability of high- 
quality observations has also intensified a discussion that 
has been present from the beginning of the history of in- 
version codes. The selection of the model to be used for 
the inversions is often influenced by subjective reasons. 
A model can be chosen deliberately to be simple because 
of the necessity of inverting large maps as fast as possible. 
This is the case of HMI (Borrero et al. 2010). An 
inversion code based on a given model is sometimes cho- 
sen based on the availability of the inversion code and 
the expertise to use it. A model might be chosen based 
on its capability to fit as much detail of the Stokes pro- 
files as possible. Sometimes the selection of a model is 
driven by spatial resolution. This is the case when there 
is a clear indication in the spectrum that structures are 
not resolved. Other possibilities can be invoked, but they 
sometimes contain a large subjective content. 

The choice of the model is also important because it 
critically affects the interpretation of the inferred parameters. 
This is especially relevant for parameters that are more 
indirectly related to observables, like the magnetic field of 
non-resolved structures. There are widespread examples in the 
literature, especially when signals are weak. For instance, there 
has been some controversy regarding the magnetic field strength in 
the quiet Sun inferred from infrared or visible Fe I lines (e.g., 
Khomenko et al. 2003; Lites & Socas-Navarro 2004; Domínguez 
Cerdeña et al. 2006; Martínez González et al. 2008). More 
recently, conflicting properties of the magnetic field in 
internetwork regions of the quiet Sun have been obtained by Lites 
et al. (2008) and Orozco Suárez et al. (2007) on the one hand, and 
by Stenflo (2010) on the other, from the very same Hinode 
observations (and partially supported with ground-based data by 
Beck & Rezaei 2009). These discrepancies are only "apparent" 
because they are 



caused by the application of different modelings when 
the organization of internetwork magnetic fields is still 
an unknown. Once we know the nature of the magnetic 
fields in the quiet Sun or, in other words, we have access 
to the most probable model among all possible ones, the 
results will be reliable. This problem does not arise for 
parameters that are more directly related to observables 
like bulk velocities, field azimuths, magnetic flux, etc. 

This paper does not deal with the actual implementa- 
tion of particular inversion algorithms and codes, or their 
respective efficiencies, but with the suitability of the un- 
derlying physical models to explain the observations. To 
this aim we perform the first fully Bayesian comparison 
of models for the inversion of Stokes profiles. We analyze a 
sample of Stokes profiles observed with the spectropolarimeter 
(SP; Lites et al. 2001) aboard Hinode (Kosugi et al. 2007) and 
with the vector magnetograph IMaX (Martínez Pillet et al. 2011) 
onboard the Sunrise balloon (Solanki et al. 2010). We discuss, for 
different cases, the model (chosen from a pool of fixed models) 
that is favored by the data and the relative probabilities of the 
models based on the computation of the evidence. The evidence is 
calculated using the Bayesian inference code developed by Asensio 
Ramos et al. (2007) and recent extensions that we have made to the 
code to deal with models with gradients. 

2. MODEL SELECTION THEORY 

We apply model selection theory to determine which model is best 
suited for explaining the Stokes profiles observed in a pixel 
(e.g., Trotta 2008 for a general description and more details). 
Let us assume we have $N_{\rm mod}$ models 
$\{M_i, i = 1 \ldots N_{\rm mod}\}$ competing to explain the same 
set of observations, formally represented by $D$. Here, $D$ will 
be represented by a formal vector 
$\mathbf{d} = [I(\lambda_1), I(\lambda_2), \ldots, Q(\lambda_1), \ldots, U(\lambda_1), \ldots, V(\lambda_1), \ldots]$ 
whose elements are the values of the Stokes parameters $I$, $Q$, 
$U$, and/or $V$ at certain wavelengths $\lambda_1, \lambda_2, \ldots$. 
By a model we mean an algorithm that depends on a set of 
parameters $\theta_i = [\theta_{i1}, \theta_{i2}, \ldots]$ (often, 
the temperature at one or several points in a model atmosphere; 
the magnetic field strength, inclination, and azimuth; the 
density, etc.), whose output is a prediction $y(\theta_i)$ of the 
data. The Bayes theorem (Jaynes 2003; MacKay 2003; Gregory 2005) 
states that the posterior probability of each model in the light 
of the observed data is 

$$ p(M_i|D) = \frac{p(D|M_i)\, p(M_i)}{p(D)}, \qquad (1) $$

where $p(M_i)$ is our prior belief in each model (which we 
will assume to be the same for all the models considered here; 
see below), while $p(D)$ is just a normalization constant: 

$$ p(D) = \sum_i p(D|M_i)\, p(M_i). \qquad (2) $$

Finally, $p(D|M_i)$ is the evidence or marginal likelihood, 
which is the key ingredient of our model comparison, 
and is given by the following integral (e.g., Trotta 2008; 
Asensio Ramos 2011): 

$$ p(D|M_i) = \int d\theta_i\, p(\theta_i|M_i)\, p(D|\theta_i, M_i). \qquad (3) $$

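The quantity $p(\theta_i|M_i)$ is the prior distribution for the 
model parameters. If, for example, we assume that $\theta_{ij}$ 
(i.e., the $j$-th parameter of model $i$) is a priori distributed 
uniformly between $\theta_{ij}^{\rm min}$ and $\theta_{ij}^{\rm max}$, then 

$$ p(\theta_i|M_i) = \begin{cases} \prod_j \left(\theta_{ij}^{\rm max} - \theta_{ij}^{\rm min}\right)^{-1} & \text{if } \theta_{ij}^{\rm min} \le \theta_{ij} \le \theta_{ij}^{\rm max} \text{ for all } j \\ 0 & \text{otherwise.} \end{cases} \qquad (4) $$

The quantity $p(D|\theta_i, M_i)$ in Eq. (3) is the likelihood, 
which is computed from the observed data. Assuming 
that the observations are corrupted with uncorrelated 
Gaussian random noise, then 

$$ p(D|\theta_i, M_i) = \prod_{j=1}^{M} \left(2\pi\sigma_j^2\right)^{-1/2} \exp\left\{ -\frac{[d_j - y_j(\theta_i)]^2}{2\sigma_j^2} \right\} \qquad (5) $$

(for more details, see Asensio Ramos et al. 2007; Asensio Ramos 
2009), where $d_j$ and $y_j(\theta_i)$ are the $j$-th elements of 
the data vector and of the model prediction, $\sigma_j^2$ is the 
noise variance of the $j$-th data point, and $M$ is the number of 
data points. In general, we will assume our priors to be uniform 
(Eq. 4); hence, the evidence in Eq. (3) simply reads 

$$ p(D|M_i) = \left[\prod_j \left(\theta_{ij}^{\rm max} - \theta_{ij}^{\rm min}\right)\right]^{-1} \int_\Omega d\theta_i\, p(D|\theta_i, M_i), \qquad (6) $$

where the integral extends over the volume $\Omega$ that contains 
the region where the priors are non-zero. It is important to note 
that, if a model parameter is completely unconstrained by the 
observed data (so the ensuing likelihood does not depend on this 
parameter), the evidence does not penalize it, because its prior 
integrates to unity and factorizes out of the integral. 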
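
The dependence of Eq. (6) on the prior range can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: the one-parameter "model" (a constant level fitted to five hypothetical data points), the function names, and the midpoint quadrature are not part of the codes used in this paper.

```python
import math

def log_likelihood(data, model, sigma):
    """Gaussian log-likelihood of Eq. (5) for uncorrelated noise of standard deviation sigma."""
    return sum(-0.5 * math.log(2.0 * math.pi * sigma ** 2)
               - (d - y) ** 2 / (2.0 * sigma ** 2)
               for d, y in zip(data, model))

def evidence_uniform_prior(data, predict, sigma, lo, hi, n=2000):
    """Eq. (6) for a single parameter: midpoint-rule integral of the
    likelihood over [lo, hi], divided by the prior width (hi - lo)."""
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        theta = lo + (k + 0.5) * step
        total += math.exp(log_likelihood(data, predict(theta), sigma))
    return total * step / (hi - lo)

# Hypothetical data: five noisy measurements of a constant level.
data = [1.1, 0.9, 1.0, 1.2, 0.8]
predict = lambda theta: [theta] * len(data)   # toy model: constant prediction

z_narrow = evidence_uniform_prior(data, predict, 0.1, 0.0, 2.0)
z_wide = evidence_uniform_prior(data, predict, 0.1, 0.0, 4.0)
print("evidence ratio narrow/wide:", z_narrow / z_wide)
```

Because both prior ranges fully contain the likelihood peak, the integral in Eq. (6) is the same in both cases and the evidence simply scales with the inverse of the prior width.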

Given two models, $M_0$ and $M_1$, that are proposed to 
explain an observation, the ratio of posteriors 

$$ \frac{p(M_0|D)}{p(M_1|D)} = \frac{p(M_0)}{p(M_1)}\, \frac{p(D|M_0)}{p(D|M_1)} \qquad (7) $$

is used to compute how much more probable one model is with 
respect to the other (Jeffreys 1961). The ratio of evidences 
is known as the Bayes factor, $B_{01}$, and it is trivially 
given by 

$$ B_{01} = \frac{p(D|M_0)}{p(D|M_1)}. \qquad (8) $$

If both models are assumed to have the same a priori 
probability (which is what we have assumed in all subsequent 
computations), the ratio of posteriors is just the 
Bayes factor. Large values of $B_{01}$ indicate a preference 
for model $M_0$, while small values indicate a preference for 
model $M_1$. Table 1 gives the modified Jeffreys scale that 
can be used to translate values of the Bayes factor into 
strengths of belief (Jeffreys 1961; Kass & Raftery 1995; 
Gordon & Trotta 2007). 
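
As an illustration of Eqs. (1), (2) and (8), the following sketch converts a list of log-evidences into posterior model probabilities (assuming equal model priors, as in the text) and maps a Bayes factor onto the scale of Table 1. The function names and the numerical log-evidences are hypothetical, and treating each tabulated value of |ln B01| as the lower edge of its category is an assumption of this sketch.

```python
import math

def posterior_model_probabilities(log_evidences):
    """Eqs. (1)-(2) with equal model priors, evaluated with the
    log-sum-exp trick to avoid underflow of very small evidences."""
    top = max(log_evidences)
    weights = [math.exp(lz - top) for lz in log_evidences]
    norm = sum(weights)
    return [w / norm for w in weights]

def jeffreys_strength(ln_b01):
    """Map |ln B01| onto the modified Jeffreys scale of Table 1
    (each tabulated value is taken as the lower edge of its category)."""
    x = abs(ln_b01)
    if x < 1.0:
        return "inconclusive"
    if x < 2.5:
        return "weak"
    if x < 5.0:
        return "moderate"
    return "strong"

# Hypothetical log-evidences of two competing models M0 and M1.
log_z = [-1204.0, -1200.0]
probs = posterior_model_probabilities(log_z)
ln_b01 = log_z[0] - log_z[1]
print(probs, ln_b01, jeffreys_strength(ln_b01))
```

Working with logarithms matters in practice because evidences for spectra with many wavelength points are far too small to be represented directly as floating-point numbers.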

Therefore, model comparison is a matter of computing 
$p(M_i|D)$ for all models and calculating their ratios for 
pairs of models. As shown by Eq. (3), the evidence contains 
a balance between the quality of the fit (encoded in 
the likelihood) and the number of parameters (encoded 
in the prior). 

TABLE 1 

Modified empirical Jeffreys' scale (taken from Trotta 2008). 

|ln B01|    Odds         Strength of evidence 
< 1.0       ≲ 3:1        Inconclusive 
1.0         ~ 3:1        Weak evidence 
2.5         ~ 12:1       Moderate evidence 
5.0         ~ 150:1      Strong evidence 

In order to gain some insight, consider the 
following example. Let $M_1$ be a model with a single 
parameter $\theta$, and $M_0$ another one with this parameter fixed 
to the value $\theta_0$. If the prior on the parameter for $M_1$ is 
flat with a range $\Delta\theta$ sufficiently large to accommodate the 
likelihood, then 



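$$ p(\theta|M_0) = \delta(\theta - \theta_0), \qquad p(\theta|M_1) = \begin{cases} 1/\Delta\theta & \text{if } \theta_{\rm min} \le \theta \le \theta_{\rm max} \\ 0 & \text{otherwise.} \end{cases} \qquad (9) $$

Now, let the likelihood be a relatively peaked function 
around the value $\hat\theta$, with a characteristic width 
$\delta\theta$. Then, from Eq. (3), the evidences in both cases read 

$$ p(D|M_0) = p(D|\theta_0, M_0), \qquad p(D|M_1) \approx p(D|\hat\theta, M_1)\, \frac{\delta\theta}{\Delta\theta}. \qquad (10) $$

Assuming identical priors for both models, the evidence 
ratio is 

$$ \frac{p(D|M_1)}{p(D|M_0)} \approx \frac{p(D|\hat\theta, M_1)}{p(D|\theta_0, M_0)}\, \frac{\delta\theta}{\Delta\theta}. \qquad (11) $$

Consequently, since the ratio of likelihoods has to be 
larger than or equal to 1 (model $M_1$ contains model 
$M_0$), model $M_1$ is preferred to model $M_0$ only when the 
gain in likelihood outweighs the factor $\delta\theta/\Delta\theta$, 
that is, when the prior range is not too large with respect to 
the width of the likelihood. This shows that the evidence ratio 
works as a Bayesian Occam's razor. 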
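
The balance expressed by Eq. (11) can be made concrete with a short numerical sketch. It assumes, for illustration only, that the likelihood is Gaussian in the parameter, in which case its integral gives an effective width of sqrt(2 pi) times its standard deviation, and that the prior range fully contains the peak; the function name and the numbers are hypothetical.

```python
import math

def ln_bayes_factor_10(theta_hat, width, theta0, prior_range):
    """ln[p(D|M1)/p(D|M0)] following Eq. (11) for a Gaussian-shaped likelihood.

    theta_hat   : parameter value at which the likelihood peaks
    width       : standard deviation of the likelihood in the parameter
    theta0      : value at which the parameter is fixed in model M0
    prior_range : width of the flat prior of model M1 (assumed to contain the peak)
    """
    # Likelihood ratio between the best fit of M1 and the fixed value of M0.
    ln_like_ratio = 0.5 * ((theta0 - theta_hat) / width) ** 2
    # Occam factor: effective likelihood width over prior range.
    ln_occam = math.log(width * math.sqrt(2.0 * math.pi) / prior_range)
    return ln_like_ratio + ln_occam

# Fixed value only 1 sigma away from the best fit: the simpler model M0 wins.
close = ln_bayes_factor_10(theta_hat=1.0, width=0.1, theta0=0.9, prior_range=10.0)
# Fixed value 10 sigma away: the extra parameter of M1 is justified.
far = ln_bayes_factor_10(theta_hat=1.0, width=0.1, theta0=0.0, prior_range=10.0)
print(close, far)
```

The first case yields a negative logarithm (the Occam factor dominates), while the second yields a large positive one (the improvement in the fit dominates).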

The main difficulty in model comparison is that the 
evidence in Eq. (3) is computationally very demanding 
because it is the result of a high-dimensional integral 
(e.g., Trotta 2008, and references therein). In recent 
years, some efficient algorithms, especially those based on 
nested sampling (Skilling 2004), have been developed to 
deal with this problem. The codes that we use in this 
paper make use of the Multinest algorithm (Feroz et al. 
2009), which performs very well in our cases. 
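
To give a feeling for how nested sampling estimates the evidence, the following is a minimal textbook-style sketch. It is not the Multinest algorithm: new live points are drawn by plain rejection sampling from the prior, which is only practical for toy problems, and the likelihood is handled in linear space. The function names, the one-dimensional test likelihood and the stopping tolerance are assumptions of this illustration.

```python
import math
import random

def nested_sampling_evidence(log_like, sample_prior, n_live=200, tol=1e-2, seed=0):
    """Estimate ln Z with basic nested sampling (rejection sampling from the prior)."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_like(p) for p in live]
    z = 0.0        # accumulated evidence
    x_prev = 1.0   # prior volume enclosed by the current likelihood contour
    i = 0
    while True:
        i += 1
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        x_now = math.exp(-i / n_live)   # expected shrinkage of the prior volume
        z += math.exp(logl_min) * (x_prev - x_now)
        x_prev = x_now
        # Replace the worst point by a prior draw inside the contour L > L_min.
        while True:
            cand = sample_prior(rng)
            cand_logl = log_like(cand)
            if cand_logl > logl_min:
                break
        live[worst], live_logl[worst] = cand, cand_logl
        # Stop when the remaining live points cannot change Z appreciably.
        if math.exp(max(live_logl)) * x_now < tol * z:
            break
    # Add the contribution of the remaining live points.
    z += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return math.log(z)

# Toy problem: unnormalized Gaussian likelihood, uniform prior on [0, 1].
def toy_log_like(theta):
    return -0.5 * ((theta - 0.5) / 0.1) ** 2

ln_z = nested_sampling_evidence(toy_log_like, lambda rng: rng.random())
ln_z_exact = math.log(0.1 * math.sqrt(2.0 * math.pi)
                      * math.erf(0.5 / (0.1 * math.sqrt(2.0))))
print(ln_z, ln_z_exact)
```

The statistical uncertainty of the estimate decreases as the number of live points grows, at the price of more likelihood evaluations.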

3. OBSERVATIONS 

3.1. Spectropolarimetry 

We select the profiles for the application of the 
Bayesian model comparison for spectropolarimetric data 
from a representative sample of what one can find in the 
solar photosphere. For the quiet Sun, they have been 
extracted from the observations analyzed by Lites et al. 
(2008). They were obtained at disk center on 2007 March 
10 with the spectropolarimeter SOT/SP aboard Hinode 
with a spatial resolution of ~0.32". The observed spectral 
region consists of the Fe I doublet at 6301.5 and 
6302.5 Å with a spectral resolution close to $3 \times 10^5$, 
resulting in a total of 112 wavelength points. After calibration, 
the standard deviation of the noise in Stokes $Q$, $U$ 
and $V$ in units of the continuum is estimated to be of the 
order of $1.1$-$1.2 \times 10^{-3}$. Likewise, the standard deviation 
of Stokes $I$ computed on a continuum window is estimated 
to be $\sim 6 \times 10^{-3}$, the increase being probably produced 
by flat-fielding effects. Given the large computational 
effort that the estimation of the evidence requires, 
we have focused on individual profiles extracted from 
the Stokes $V$ classification calculated by Viticchié et al. 
(2011) using a k-means unsupervised classification algorithm 
widely used in machine learning (Everitt 1995; 
Bishop 2006). We have selected individual profiles from 
the observations whose polarization amplitude is, in any 
of the Stokes parameters, above a threshold of 4.5 times 
the standard deviation of the noise. This way, we avoid 
large uncertainties in the model parameters, as pointed 
out by Asensio Ramos (2009). The considered profiles 
are shown as black curves in Fig. 1 and their classification, 
including the nomenclature for the shape of the profiles 
used by Viticchié et al. (2011), is displayed in Table 
IH Of the six groups available (network, blue-lobe, red- 
lobe, asymmetric, antisymmetric and Q-like), we have 
picked representative profiles, some of them having no 
apparent linear polarization above the noise threshold 
and some of them showing clear signals. We consider 
that they represent a good sample of what one can find 
in the quiet Sun observed with Hinode. 
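
The selection criterion described above can be sketched as follows. This is a minimal illustration: the function name, the dictionary layout and the sample amplitudes are hypothetical, and "any of the Stokes parameters" is interpreted here as the polarization profiles Q, U and V.

```python
def exceeds_threshold(stokes, sigma, factor=4.5):
    """Return True if the peak absolute amplitude of any polarization
    profile (in continuum units) exceeds factor times the noise level sigma."""
    return any(max(abs(value) for value in profile) > factor * sigma
               for profile in stokes.values())

# Hypothetical profiles (continuum units) and the quiet-Sun noise level.
noise = 1.1e-3
weak_pixel = {"Q": [1e-3, -2e-3], "U": [0.5e-3, 1e-3], "V": [3e-3, -4e-3]}
strong_pixel = {"Q": [1e-3, -2e-3], "U": [0.5e-3, 1e-3], "V": [6e-3, -5.5e-3]}
print(exceeds_threshold(weak_pixel, noise), exceeds_threshold(strong_pixel, noise))
```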

Concerning the profiles associated with the umbra and 
penumbra, they were extracted from Hinode observations 
obtained on 2007 February 27. The estimated noise level 
in the umbra for Stokes $Q$, $U$ and $V$ is of the order of 
$5 \times 10^{-3}$ in units of the continuum intensity, while it 
increases to 0.02 for Stokes $I$. The noise level is much 
larger than for the quiet Sun profiles, partly because of 
the reduced number of photons and also because of the 
appearance of molecular lines that we do not fit. 

3.2. Imaging polarimetry 

Four IMaX observations have been chosen to study 
model comparison in imaging polarimetric data. The 
first one is characterized by Stokes V above the noise 
level and Stokes Q and U below the noise. The sec- 
ond one has Stokes Q and U above the noise and V 
below. The third has all Stokes parameters above the 
noise, while the fourth has all Stokes parameters below 
the noise. The analysis of model comparison for the 
inversion of these profiles is of special relevance given 
their low spectral sampling. The observations consist of 
the four Stokes parameters at -80, -40, +40, +80 and 
+227 mÅ around the Fe I line at 5250.209 Å. The spectral 
point spread function (PSF) has very extended tails, 
typical of Fabry-Perot instruments. Instead of taking it 
into account exactly, we follow Lagg et al. (2010), who 
substituted the real PSF by a Gaussian PSF with a full width 
at half-maximum (FWHM) of 85 mÅ (~2.9 km s$^{-1}$). 
Although the Gaussian PSF does not present extended 
tails, its convolution with the FTS spectrum gives line 
profiles similar to those obtained using the correct PSF 
(see Martínez Pillet et al. 2011). 

The estimated noise level is $4 \times 10^{-3}$ for Stokes $I$ and 
$10^{-3}$ for Stokes $Q$, $U$ and $V$, all in units of the continuum 
intensity. The spatial resolution of the instrument 
is estimated to be between 0.15" and 0.18" (Lagg et al. 
2010). 



Table 5 shows some interesting properties of the observed 
Stokes profiles: $P_{\rm tot}$ is the maximum total polarization, 
$P_V$ is the maximum circular polarization, $a$ and 
$A$ are the area and amplitude asymmetries, respectively 
(e.g., Solanki & Stenflo 1986), while $v_V$ and $v_I$ are the 
Stokes $V$ zero-crossing velocity and the velocity of the 
Stokes $I$ minimum. The velocities are measured relative to the 
rest wavelength of the spectral lines (they are not absolute). 
The sign of the area asymmetry is chosen equal 
to the sign of the bluest peak of Stokes $V$, following 
Martínez Pillet et al. (1997). We have not tabulated the 
asymmetries for the IMaX data, because we are not confident 
in their values with only 5 points in wavelength, nor for 
class 34, because it clearly shows the presence of several 
lobes, making the definition of asymmetries invalid. 
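
For reference, the asymmetries of a two-lobed Stokes V profile can be computed as in the sketch below. The definitions used here (normalized difference of the unsigned lobe amplitudes, and the wavelength integral of V normalized by the integral of |V| and signed with the blue lobe) are the ones commonly found in the literature; the text does not spell them out, so treating them as the definitions behind Table 5 is an assumption, as are the function name and the synthetic profiles.

```python
import math

def stokes_v_asymmetries(wavelength, v):
    """Return (area_asymmetry, amplitude_asymmetry) of a two-lobed Stokes V profile.

    wavelength must be increasing; the blue lobe is the extremum found at the
    shorter wavelength.
    """
    i_max = max(range(len(v)), key=v.__getitem__)
    i_min = min(range(len(v)), key=v.__getitem__)
    i_blue, i_red = sorted((i_max, i_min))
    a_blue, a_red = abs(v[i_blue]), abs(v[i_red])
    sign_blue = 1.0 if v[i_blue] > 0.0 else -1.0
    signed = unsigned = 0.0
    for k in range(len(v) - 1):                      # trapezoidal integrals
        dl = wavelength[k + 1] - wavelength[k]
        signed += 0.5 * (v[k] + v[k + 1]) * dl
        unsigned += 0.5 * (abs(v[k]) + abs(v[k + 1])) * dl
    area = sign_blue * signed / unsigned
    amplitude = (a_blue - a_red) / (a_blue + a_red)
    return area, amplitude

# Synthetic antisymmetric profile, and a version with an enhanced blue lobe.
grid = [-1.0 + 0.02 * k for k in range(101)]
v_sym = [-x * math.exp(-x * x / (2.0 * 0.04)) for x in grid]
v_asym = [2.0 * val if x < 0.0 else val for x, val in zip(grid, v_sym)]
print(stokes_v_asymmetries(grid, v_sym), stokes_v_asymmetries(grid, v_asym))
```

With only a handful of wavelength samples, as in the IMaX data, both the lobe extrema and the integrals are poorly defined, which is why such numbers are not tabulated for those profiles.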

4. ATMOSPHERIC MODELS 

Among the infinitely many models one might build 
to reproduce the emergent Stokes profiles from a 
magnetized atmosphere, we consider for our analy- 
sis those most widely used in the literature. All 
of them are based on different approximations for 
the solution of the radiative transfer equation for po- 
larized radiation in a plane-parallel atmosphere (see 
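Landi Degl'Innocenti & Landolfi 2004): 

$$ \frac{d\mathbf{S}}{dz} = \boldsymbol{\epsilon} - \mathbf{K}\mathbf{S}, \qquad (12) $$

where $z$ is the spatial coordinate along the ray, 
$\mathbf{S} = (I, Q, U, V)^\dagger$ is the Stokes vector, 
$\boldsymbol{\epsilon}$ the emissivity vector, and $\mathbf{K}$ the 
propagation matrix. 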

All models considered assume local thermodynamic 
equilibrium. Therefore, $\boldsymbol{\epsilon}$ and $\mathbf{K}$ depend on the local 
thermodynamic and magnetic properties of the medium 
but not on $\mathbf{S}$ itself. Three types of hypotheses are 
assumed for the variation of $\boldsymbol{\epsilon}$ and $\mathbf{K}$ with $z$: i) constant 
along the line of sight, ii) some thermodynamic quantities 
vary (linearly) but the magnetic field is constant, 
iii) all quantities may vary with $z$. Finally, we consider 
models with a single atmosphere occupying the whole resolution 
element, and models with two independent atmospheres 
within the same resolution element. In the latter case, 
the emergent Stokes profiles are given by 

$$ \mathbf{S} = f\,\mathbf{S}_1 + (1-f)\,\mathbf{S}_2, \qquad (13) $$

where $\mathbf{S}_1$ and $\mathbf{S}_2$ are the emergent profiles of each of the 
two components, which occupy fractions $f$ and $(1-f)$ 
of the pixel, respectively. Two possibilities are considered 
for these two components. First, $\mathbf{S}_1$ corresponds to 
a magnetic component (hence, $Q$, $U$, and $V$ are, in general, 
non-zero), while $\mathbf{S}_2$ forms in a field-free atmosphere 
(hence, $Q = U = V = 0$). Second, both atmospheres are 
magnetized. 

In these models, we consider the formation of the 
widely used Fe I 630 nm doublet and the 5250.2 Å line. 
The atomic data for the synthesis of the lines have been 
compiled from the VALD database (Piskunov et al. 1995; 
Kupka et al. 1999) and are summarized in Table 2. 
Collisional broadening is treated under the formalism of 
Anstee & O'Mara (1995) for the broadening of spectral 
lines (only for allowed transitions), with the velocity 
parameter ($\alpha$) and the line broadening cross section ($\sigma$, in 
units of $a_0^2$, with $a_0$ the Bohr radius) obtained from the 
code developed by Barklem et al. (1998). 

Polarimetric data for the imaging polarimetry case are 
calculated from high spectral resolution synthetic profiles 
convolved with Gaussian transmission functions of 
85 mÅ width, then sampled at the five spectral positions 
used in the V5-6 mode of IMaX. 
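
The degradation of a finely sampled synthetic profile to the five filter positions can be sketched as follows. This is only an illustration: the function name, the uniform wavelength grid and the Gaussian test line are assumptions, while the 85 mÅ width (taken as the FWHM) and the five offsets are those quoted in Sect. 3.2.

```python
import math

IMAX_OFFSETS_MA = (-80.0, -40.0, 40.0, 80.0, 227.0)   # mÅ from line center
FWHM_MA = 85.0                                         # Gaussian PSF FWHM in mÅ

def sample_with_gaussian_psf(wavelength, profile,
                             offsets=IMAX_OFFSETS_MA, fwhm=FWHM_MA):
    """Convolve a profile given on a fine, uniform grid (mÅ from line center)
    with a Gaussian transmission function and evaluate it at the offsets."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    samples = []
    for center in offsets:
        weights = [math.exp(-0.5 * ((w - center) / sigma) ** 2) for w in wavelength]
        norm = sum(weights)   # normalization keeps a flat continuum unchanged
        samples.append(sum(wt * p for wt, p in zip(weights, profile)) / norm)
    return samples

# Hypothetical Gaussian absorption line sampled every 1 mÅ.
grid = [float(x) for x in range(-1000, 1001)]
line = [1.0 - 0.5 * math.exp(-0.5 * (x / 40.0) ** 2) for x in grid]
print(sample_with_gaussian_psf(grid, line))
```

The same operation is applied to each of the four Stokes parameters, so that the synthetic data have the same spectral smearing and sampling as the observations.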

In order to limit the scope of our analysis, we 
left out of our study physical models considering the 
formation of the spectral lines in non-LTE conditions 
(e.g., Socas-Navarro et al. 2000). It is well established 
that these effects play a negligible role in the 
formation of the spectral features we are interested 
in (Shchukina & Trujillo Bueno 2001). We did not 
consider, either, models with more than two components 
(e.g., Bernasconi & Solanki 1996; Beck et al. 
2007; Beck & Rezaei 2009) or statistical models in 
which the thermodynamic and/or magnetic properties 
of the atmosphere are given statistically (e.g., 
Sánchez Almeida et al. 1996; Sánchez Almeida 1997; 
Carroll & Kopf 2007). These models have been introduced 
to explain some of the most complex observed features, 
which are difficult or impossible to explain with the 
models considered here. Furthermore, we also limit this 
analysis to the Zeeman effect, neglecting atomic level 
polarization. The study of these important classes of models 
is left for future work. 

4.1. Weak-field approximation 

The first model we use for this analysis 
is the well-known weak-field approximation 
(Landi Degl'Innocenti & Landi Degl'Innocenti 1973). 
It is valid whenever the splitting produced in a given 
spectral line via the Zeeman effect is smaller than the 
intrinsic line broadening. In this approximation, there is 
a very simple relation between the magnetic properties 
of the plasma, the wavelength derivatives of the Stokes $I$ 
profile and the emergent Stokes parameters, at first order for 
Stokes $V$ and at second order for Stokes $Q$ and $U$ (see App. 
A). It is important to point out that, strictly speaking, 
the likelihood in the weak-field approximation depends 
only on $D = \{Q, U, V\}$. Stokes $I$ enters only through the 
computation of the derivatives used in Eq. (A1) and, 
consequently, is part of the model, not of the 
observations^6. Because of this, the formalism of model selection 
cannot be directly applied to compare the weak-field 
model with the rest of the models, because they have to 
share the same set of observations $D = \{I, Q, U, V\}$. To 
fix this issue, we propose a slightly revised weak-field 
approximation in which we also model Stokes $I$ as an 
absorption line at central wavelength $\lambda_0$ using: 



$$ I(\lambda) = 1 - d\, H\!\left( \frac{\lambda - \lambda_0 - \lambda_0 v_{\rm LOS}/c}{\Delta\lambda_{\rm dopp}},\, a \right), \qquad (14) $$



where $H(v,a)$ is a Voigt profile. Therefore, each Stokes 
$I$ profile is defined with the aid of the line absorption ($d$), the 
Doppler width of the line in wavelength units ($\Delta\lambda_{\rm dopp}$), 
the wavelength shift due to a macroscopic bulk velocity 
($v_{\rm LOS}$) and the damping constant ($a$). To these parameters, 
we add the LOS component of the magnetic field 
vector ($B_\parallel$), the projection of the magnetic field vector 
on the plane perpendicular to the LOS ($B_\perp$) and the azimuth 
of the magnetic field ($\chi$). This variant of the 
weak-field approximation is also interesting for the V5-6 
mode of IMaX because of the scarcity of wavelength points. This 
way, the wavelength derivatives of Stokes $I$ needed for 
defining the circular and linear polarization profiles are 
evaluated with more precision. 

6 When noisy quantities are part of the model, the likelihood 
function has to be modified accordingly (Gregory 2005). 
Asensio Ramos & Manso Sainz (2011) presented such an example 
for the inversion of Stokes profiles with a model with local 
stray-light. 
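
A possible implementation of this revised weak-field model is sketched below. Several ingredients are assumptions of the illustration and are not taken from the paper: the textbook weak-field expressions (V proportional to the first derivative of I, Q and U proportional to the second derivative) stand in for Eq. (A1), which is not reproduced here, and their sign conventions may differ from it; the Voigt function is evaluated by a simple quadrature; the derivatives are taken by finite differences; and all function names and test values are hypothetical.

```python
import math

C_ZEEMAN = 4.6686e-13    # Zeeman constant in 1/(Å G)
C_LIGHT = 2.99792458e5   # speed of light in km/s

def voigt_h(v, a, n=2000):
    """Voigt function H(v, a), normalized so that H(0, 0) = 1, evaluated with
    the substitution y = v + a*tan(t) and the midpoint rule in t."""
    step = math.pi / n
    total = 0.0
    for k in range(n):
        t = -0.5 * math.pi + (k + 0.5) * step
        total += math.exp(-(v + a * math.tan(t)) ** 2)
    return total * step / math.pi

def stokes_i(lam, lam0, d, dopp, v_los, a):
    """Stokes I absorption line (wavelengths in Å, v_los in km/s)."""
    shift = lam0 * v_los / C_LIGHT
    return 1.0 - d * voigt_h((lam - lam0 - shift) / dopp, a)

def weak_field_stokes(lam, lam0, d, dopp, v_los, a,
                      b_par, b_perp, chi, g_eff, g_quad, h=1e-3):
    """Return (I, Q, U, V) at wavelength lam in the weak-field approximation.
    Fields in G, chi in radians; derivatives by central finite differences."""
    i0 = stokes_i(lam, lam0, d, dopp, v_los, a)
    ip = stokes_i(lam + h, lam0, d, dopp, v_los, a)
    im = stokes_i(lam - h, lam0, d, dopp, v_los, a)
    di = (ip - im) / (2.0 * h)
    d2i = (ip - 2.0 * i0 + im) / h ** 2
    v = -C_ZEEMAN * lam0 ** 2 * g_eff * b_par * di
    lin = -0.25 * (C_ZEEMAN * lam0 ** 2 * b_perp) ** 2 * g_quad * d2i
    return i0, lin * math.cos(2.0 * chi), lin * math.sin(2.0 * chi), v

# Hypothetical parameters for the 6302.494 Å line (g_eff and G from Table 2).
args = (6302.494, 0.5, 0.03, 0.0, 0.1, 100.0, 50.0, 0.0, 2.5, 6.25)
red = weak_field_stokes(6302.494 + 0.03, *args)
blue = weak_field_stokes(6302.494 - 0.03, *args)
print(red, blue)
```

Because Stokes I is itself a parametric function, its derivatives can be evaluated at any wavelength, which is what makes the approach usable with only five spectral samples.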

4.2. Milne-Eddington models 

The second simplest model we consider is based on the 
Unno-Rachkovsky solution of the radiative transfer equation 
in a Milne-Eddington atmosphere (see Harvey et al. 
1972; Auer et al. 1977; Landi Degl'Innocenti & Landolfi 
2004 for details). Under this approximation, we assume 
that the ratio between the line absorption coefficient and 
the continuum absorption coefficient does not vary with 
depth in the atmosphere. The same happens with the 
bulk velocity of the plasma and the magnetic field vector. 
Additionally, we assume that the source function 
varies linearly with optical depth along the LOS. Each 
(magnetic or non-magnetic) component is characterized 
by a vector of physical quantities $\theta$ that contains: the 
Doppler width of the line in wavelength units ($\Delta\lambda_{\rm dopp}$), 
the wavelength shift due to a macroscopic bulk velocity 
($v_{\rm LOS}$), the gradient of the source function ($\beta$), the 
ratio between the line and continuum absorption coefficients 
($\eta_0$) for each line and a line damping parameter 
($a$). This vector is augmented in the magnetized components 
with the magnetic field vector parameterized by 
its modulus, inclination and azimuth with respect to the 
local vertical direction ($B$, $\theta_B$ and $\phi_B$, respectively). 
Additionally, a filling factor ($f$) is included to weight the two 
components following Eq. (13). The specific equations 
used in our code are shown in App. A. The number of 
unknowns is 16 for the case of one field-free component 
plus a magnetic one (labeled ME1+1) and 19 for the case 
of two magnetic components (labeled ME2). Note that 
the number of wavelength points of the IMaX data is 
similar to the number of free parameters. Consequently, 
we expect the model selection scheme to favor simpler 
models (with fewer free parameters) for IMaX 
observations. 

We use uniform priors for all variables, with the ranges 
indicated in Table 3. It is important to note that the 
ranges of the parameters have to be set up realistically 
since they affect the final value of the evidence. The reason 
is that the evidence is the integral of the normalized 
likelihood weighted by the normalized prior. As a consequence, 
the larger the prior volume, the smaller the 
evidence. We consider that the values shown in Table 3 
are a good representation of what we expect a priori. In 
any case, we have empirically tested that modifying the 
range of the parameters (always making sure not to cut 
regions of the space of parameters that are compatible 
with the observations) has a small effect on the evidence 
because of the dominance of the integral of the likelihood. 
However, adding a new parameter strongly modifies the 
evidence because of the increased dimensionality of the 
space of parameters. Note that we avoid the 180° ambiguity 
by restricting the azimuth to lie between 0° and 
180°. 
TABLE 2 

Atomic parameters. 

λ (Å)      Excitation pot. (eV)   log(gf)   Transition           σ       α       ḡ       G 
6301.498   3.654                  -0.745    $^5D_2 - ^5P_2$      839.9   0.243   1.667   2.517 
6302.494   3.687                  -1.203    $^5D_0 - ^5P_1$      856.9   0.240   2.500   6.250 
5250.209   0.121                  -4.938    $^5D_0 - ^7P_1$      ...     ...     3.000   9.000 



The computation of the evidence is carried out with 
Bayes-ME (Asensio Ramos et al. 2007)^7, the Bayesian 
inference code for Milne-Eddington atmospheres. This 
code makes use of the Multinest algorithm (Feroz et al. 
2009), based on the nested sampling approach introduced 
by Skilling (2004). One of the free parameters of Multinest 
is the number of live points (see Feroz et al. 2009 
for more details), which is directly related to the final 
precision of the estimate of the evidence. We have verified 
that $n_{\rm live} = 600$ gives enough precision for our purposes. 
Concerning the computing time, each inference 
with Bayes-ME takes of the order of one minute, being 
slightly dependent on the number of free parameters. 

4.3. Models with gradients along the LOS 

Some of the Stokes profiles analyzed in this work 
present asymmetries, both in area and in amplitude. It 
is known that amplitude asymmetries can be produced 
by the presence of more than one magnetic component 
(even with Milne-Eddington atmospheres), something 
that we already take into account in Eq. (13). 
On the contrary, area asymmetries can only be produced 
in the presence of correlated gradients of the 
velocity and the magnetic field vector along the LOS. 
For this reason we also consider models that accommodate 
such gradients and are in local thermodynamic 
equilibrium (LTE), which constitute our most complex 
models. 

Following previous approaches (Ruiz Cobo & del Toro
Iniesta 1992; Socas-Navarro et al. 1998; Frutiger et al.
2000), the Harvard-Smithsonian Reference Atmosphere model
(HSRA; Gingerich et al. 1971) is used as a starting
point. In this model, the physical quantities defined in
Table 3 are perturbed (within the range presented in the
table) at predefined positions (usually termed nodes)
to improve the fit. A polynomial function of the
continuum optical depth is fit to the values at the nodes
and added directly to the original HSRA stratification.
The order of the polynomial depends on the number
of nodes considered for each physical quantity. If
only one node is chosen, the HSRA stratification is
perturbed by adding a constant at every height, thus
keeping the original gradients. If two nodes are chosen,
a straight line is used to modify the HSRA stratification,
thus introducing additional gradients. In the case of
three nodes, a parabola modifies the gradient and the
curvature. For instance, a parabolic function is added to
the HSRA temperature stratification, where the values at
three nodes in the atmosphere are chosen in the range
[-4000,4000] K.
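The node scheme described above can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the function name, the node positions and the linear "reference" stratification are hypothetical stand-ins (the real HSRA values are not reproduced here), and the polynomial is built with `numpy.polyfit` using a degree equal to the number of nodes minus one.

```python
import numpy as np


def perturb_stratification(log_tau, base, node_log_tau, node_values):
    """Add a polynomial perturbation defined by node values to a stratification.

    log_tau      : optical depth grid (log10 of the continuum optical depth)
    base         : reference stratification on that grid (e.g. temperature)
    node_log_tau : positions of the nodes (assumed values, for illustration)
    node_values  : perturbation at each node
    """
    node_log_tau = np.asarray(node_log_tau, dtype=float)
    node_values = np.asarray(node_values, dtype=float)
    # 1 node -> constant, 2 nodes -> straight line, 3 nodes -> parabola.
    degree = node_values.size - 1
    coeffs = np.polyfit(node_log_tau, node_values, degree)
    return np.asarray(base, dtype=float) + np.polyval(coeffs, log_tau)


if __name__ == "__main__":
    log_tau = np.linspace(-4.0, 1.0, 6)
    base_t = np.linspace(4000.0, 9000.0, 6)   # placeholder, not the HSRA model
    print(perturb_stratification(log_tau, base_t, [-1.0], [250.0]))
    print(perturb_stratification(log_tau, base_t, [-3.0, 0.0], [-100.0, 200.0]))
    print(perturb_stratification(log_tau, base_t,
                                 [-3.0, -1.5, 0.0], [100.0, -50.0, 300.0]))
```

With one node the original gradients are untouched, with two nodes a constant extra gradient appears, and with three nodes the curvature changes as well, mirroring the three cases discussed in the text.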

The number of nodes selected for each physical parameter
is displayed in Table 3. We fix the number of
nodes for the temperature to three, so that the total
number of nodes is six for the two components. In order
to test the importance of gradients, we consider
two options for the LOS velocity and the magnetic
field strength. The first one assumes that both quantities
take a constant value throughout the atmosphere (models
NOGR1+1 and NOGR2, depending on the number of
magnetic components). The second one introduces a linear
gradient with optical depth (models LINGR1+1 and
LINGR2). This allows for correlated gradients, which
can potentially generate asymmetries in
the Stokes profiles. The number of nodes of the rest of
the parameters is kept fixed in both models.

7 All codes can be freely downloaded from the webpage
http://www.iac.es/proyecto/magnetism.

The evidence is calculated with Bayes-LTE, an updated
version of Bayes-ME that makes use of an accelerated
version of the synthesis core of Nicole (Socas-Navarro,
de la Cruz Rodriguez, Asensio Ramos, Trujillo
Bueno & Ruiz Cobo, in preparation) and is also based
on the Multinest algorithm. The computational cost of
Bayes-LTE is larger than that of Bayes-ME given the
large number of evaluations of the model. Each inference
with Bayes-LTE takes of the order of 10-20 minutes, being
slightly dependent on the number of free parameters.
The synthesis engine uses the Hermitian formal solver of
Bellot Rubio et al. (1998) and obtains the pressure scale
by putting the model in hydrostatic equilibrium, using
a neural network approach to speed up the calculation.
Once synthesized, the lines are convolved with a macroturbulent
velocity ($v_\mathrm{mac}$) to increase the broadening and
produce a better fit.

5. RESULTS AND DISCUSSION 

5.1. Maximum a-posteriori profiles 

Although a potentially large space of parameters is
compatible with the observations, the set of parameters
that maximizes the full multidimensional posterior
(maximum a-posteriori; MAP) is of some interest. Since we
use flat priors, this solution is equivalent to the one that
maximizes the likelihood. Additionally, because we use
a Gaussian likelihood, this solution is also equivalent to
the one that minimizes the standard $\chi^2$ metric used in
standard least-squares inversion codes. It is important
to stress that this solution is not distinguished in any
special way from all those that fit the profiles inside the
error bars.
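The chain of equivalences in the previous paragraph can be written out explicitly. The notation below ($d_i$ for the observed points, $f_i(\theta)$ for the synthetic profiles, $\sigma_i$ for the noise standard deviation) is introduced only for this sketch and is an assumption, not necessarily the notation used elsewhere in the paper.

```latex
% Flat prior: the posterior is proportional to the likelihood inside the prior box
p(\theta | D, M) \propto \mathcal{L}(\theta)\, p(\theta | M)
                 \propto \mathcal{L}(\theta)
\quad \Rightarrow \quad
\theta_\mathrm{MAP} = \arg\max_\theta \mathcal{L}(\theta).

% Gaussian likelihood with uncorrelated noise
\ln \mathcal{L}(\theta) = -\frac{1}{2} \sum_{i=1}^{N}
   \left[ \frac{d_i - f_i(\theta)}{\sigma_i} \right]^2 + \mathrm{const}
   = -\frac{1}{2} \chi^2(\theta) + \mathrm{const}
\quad \Rightarrow \quad
\arg\max_\theta \mathcal{L}(\theta) = \arg\min_\theta \chi^2(\theta).
```

The constant collects the normalization of the Gaussian, which does not depend on the model parameters as long as the noise level is fixed.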

Fig. 1 shows the observed Hinode Stokes I, Q, U and
V in black, together with the best fits. The first column
corresponds to the fits with Milne-Eddington models, the
second to LTE models without gradients along the LOS
in the field strength and velocity, and the third column
to LTE models with gradients. In each column, the red
curves correspond to the case of one magnetic component
plus a field-free one, while the blue curves refer to the
case of two magnetic components. Compared with typical
inversion results, one would say that all models are
Fig. 1. Maximum a-posteriori fits to the observed Stokes profiles. The black curves are the observed Stokes profiles. The red curves
correspond to the models with one magnetic and one non-magnetic component, while the models with two magnetic components
correspond to the blue profiles. We indicate the value of the $\chi^2$ for each fit, where the labels b and r refer to "blue" and "red" curves,
respectively.
TABLE 3

Model parameters and prior ranges.

Parameter        Prior range         WEAKF  ME1+1  ME2  NOGR1+1  NOGR2  LINGR1+1  LINGR2
B∥               [-3000, 3000] G       1     ...   ...    ...     ...      ...      ...
B⊥               [0, 4000] G           1     ...   ...    ...     ...      ...      ...
[symbol lost]    [0.01, 0.08] mÅ       1      2     2     ...     ...      ...      ...
v_LOS            [-5, 5] km s^-1       1      2     2      2       2        4        4
[symbol lost]    [0, 40]              ...     2     2     ...     ...      ...      ...
η_0              [0, 40]              ...     4     4     ...     ...      ...      ...
a                [0, 0.5]              1      2     2     ...     ...      ...      ...
B                [0, 3000] G          ...     1     2      1       2        2        4
θ_B              [0, 180] deg         ...     1     2      1       2        1        2
φ_B              [0, 180] deg          1      1     2      1       2        1        2
T                [-4000, 4000] K      ...    ...   ...     6       6        6        6
v_mic            [0, 4] km s^-1       ...    ...   ...     4       4        4        4
v_mac            [0.1.4] km s^-1      ...    ...   ...     1       1        1        1
d                [0, 1]                1     ...   ...    ...     ...      ...      ...
f                [0, 1]               ...     1     1      1       1        1        1
Total number of parameters             7     16    19     17      20       20       24



TABLE 4

Log-evidences for all profiles under different models.

            V shape     SNR V    SNR L     WEAKF      ME1+1      ME2     NOGR1+1    NOGR2   LINGR1+1   LINGR2
Class 0     Network     23.42     2.77    1723.81    1660.62   1753.04   1809.28   1860.67   1906.46   1869.03
Class 1     Blue-lobe   19.35     4.08    1681.35    1749.58   1823.81   1832.26   1784.11   1869.85   1821.19
Class 2     Asymm.       8.61     2.69    2012.98    1935.33   2013.65   1988.22   1985.49   1948.34   1958.16
Class 4     Network    121.61    11.34     376.78    -176.43   1488.50   1135.08   1750.46   1347.22   1870.46
Class 9     Antisymm.    7.02     3.36    2020.51    1931.57   2005.87   2023.87   2011.58   1968.85   1968.04
Class 11    Asymm.      14.90     8.39    1866.99    1784.53   1989.53   1987.54   2005.74   1867.97   1948.63
Class 17    Red-lobe     7.92     3.46    1799.50    1917.24   2020.91   2010.74   1997.67   1986.40   1970.23
Class 25    Q-like       6.38     3.27    1891.65    1986.42   1962.10   2003.31   1987.00   1983.41   1976.43
Class 34    Q-like       6.81     7.16    1846.99    1942.06   1974.09   1966.59   1947.35   1938.07   1912.06
Penumbra    ...        130.91    71.10  -16732.98     336.36   1096.64   -342.23    540.81   -312.53    796.47
Umbra       ...         25.61    10.79    -125.80    1286.08   1350.64   1137.25   1290.49   1136.49   1271.89
IMaX1       Large V      2.83     4.52      50.73      41.30     20.84    -34.56    -57.58    -41.33   -105.31
IMaX2       Large QU     9.80     1.70      30.42      20.99     29.90    -53.33    -49.52    -42.52    -94.61
IMaX3       Large QUV    6.14     5.10      12.67      27.55     13.25    -39.50    -61.06    -49.78    -82.14
IMaX4       Weak QUV     1.30     1.15      69.83      26.97      4.93    -30.29    -81.86    -63.25   -140.28



TABLE 5

Bayesian Information Criterion for all profiles under different models.

              WEAKF     ME1+1      ME2     NOGR1+1    NOGR2   LINGR1+1   LINGR2
Class 0      1349.17   1473.11   1272.03   1047.49    951.23    845.89    897.67
Class 1      1436.13   1239.75   1116.04   1003.70   1078.10    874.41    944.51
Class 2       771.29    889.01    745.76    683.87    669.06    755.23    717.76
Class 4      4037.33   5135.94   1767.55   2423.49   1150.73   1978.35    832.77
Class 9       757.79    895.62    758.04    627.73    645.41    717.80    703.27
Class 11     1057.44   1175.08    779.94    674.40    639.60    907.18    729.46
Class 17     1196.41    954.41    730.28    661.41    647.05    697.42    684.23
Class 25     1014.89    806.24    803.24    687.35    706.02    717.65    705.45
Class 34     1096.30    880.69    779.09    756.46    766.41    801.63    807.33
Penumbra    38124.65   3972.00   2428.08   5229.84   3425.94   5118.68   2887.95
Umbra        3691.00    843.90    695.22   1067.04    723.85   1051.54    737.57
IMaX1          77.15    135.19    144.92   1521.15   1284.42   1237.18   2160.23
IMaX2         116.64    182.23    130.70   2344.45    515.75   1125.61   1414.07
IMaX3         149.54    153.08    159.19   1748.04   1788.02   1400.30   1509.17
IMaX4          35.86    164.37    182.56   1650.46   2080.83   1924.08   3796.05
TABLE 6

Properties of the observed Stokes profiles.

            P_tot [%]   P_V [%]   δa [%]   δA [%]   v_V [km s^-1]   v_I [km s^-1]
Class 0        2.55       2.55     12.48    39.87        0.24            0.03
Class 1        2.12       2.10     48.98    49.06       -1.65            0.04
Class 2        0.94       0.94     10.91   -11.28       -1.27           -0.86
Class 4       13.23      13.22      1.27    12.84       -0.52           -0.53
Class 9        0.77       0.76     10.51    14.67        0.09           -0.23
Class 11       1.82       1.62     -0.89     5.93       -0.71           -0.26
Class 17       0.95       0.86    -17.41   -35.69       -1.72           -0.27
Class 25       0.70       0.69    -18.42    43.70        4.81            0.48
Class 34       0.86       0.74      ...      ...        -4.27           -1.06
Penumbra      18.93      17.40      4.50    -2.84       -0.17           -0.05
Umbra         21.10      21.08     -4.49    -5.19       -0.17            2.13
IMaX1          0.53       0.28      ...      ...        -3.24            1.08
IMaX2          0.99       0.98      ...      ...         2.36            1.69
IMaX3          0.70       0.61      ...      ...        -0.43           -0.00
IMaX4          0.17       0.13      ...      ...        -3.17            1.31



TABLE 7

Values of the reduced $\chi^2$ for the best fits and their ratios.

            ME1+1    ME2    ME2/ME1+1   NOGR1+1   NOGR2   NOGR2/NOGR1+1   LINGR1+1   LINGR2   LINGR2/LINGR1+1
Class 0      3.07    2.58     0.84        2.11     1.85       0.88          1.62      1.68        1.04
Class 1      2.55    2.23     0.88        2.01     2.13       1.06          1.68      1.78        1.06
Class 2      1.77    1.41     0.80        1.29     1.22       0.94          1.41      1.28        0.90
Class 4     11.25    3.69     0.33        5.18     2.30       0.44          4.14      1.53        0.37
Class 9      1.78    1.43     0.80        1.17     1.17       1.00          1.33      1.24        0.93
Class 11     2.40    1.48     0.62        1.27     1.16       0.91          1.75      1.30        0.74
Class 17     1.91    1.37     0.72        1.24     1.17       0.94          1.28      1.20        0.93
Class 25     1.58    1.53     0.97        1.30     1.30       1.00          1.33      1.25        0.94
Class 34     1.75    1.48     0.85        1.46     1.44       0.99          1.52      1.48        0.97
Penumbra     8.65    5.16     0.60       11.44     7.37       0.64         11.15      6.12        0.55
Umbra        1.67    1.29     0.78        2.15     1.34       0.62          2.07      1.32        0.64



able to do a good job of fitting the full set of profiles. The
value of the reduced $\chi^2$ metric is shown in each panel for
the 1+1 component (red, r) and 2 component (blue, b)
models. They are displayed again in Tab. 7, where we
also show the ratio between the reduced $\chi^2$ of the model
with 2 components and that of the model with 1+1 components.
We get small values (close to 1) for all fits, indicating that the
fits are acceptable, something that is also apparent from
a pure visual inspection. In general, we find a quite systematic
decrease of the reduced $\chi^2$ when adding a second
magnetic component, meaning that the fit is marginally
improved. The decrease of the reduced $\chi^2$ when adding a
second magnetic component is especially large for a few
profiles (e.g., class 4 and penumbra). They coincide with
the observations with the largest signal-to-noise ratio (SNR).
This is the expected behavior in general, a consequence of
the increase in the number of free parameters, which gives
the model more flexibility to fit the details of the
observed Stokes profiles. If the SNR of the observation
is large, even a relatively good fit obtained with the 1+1
models will produce a large $\chi^2$ (this is what happens with
class 4 and penumbra), which will be largely reduced by
allowing more freedom to the model (using two magnetic
components). However, this is not always the case, given
that some ratios reported in Tab. 7 are larger than 1.
This is the case for classes 0 and 1 in the models with gradients
along the LOS. They correspond to profiles with
a low polarization amplitude and strong Stokes V asymmetries.
In summary, it is hard to say from these fits
which model is preferred over the others. Pragmatically,
in light of the quality of the fits, one would choose the
simplest model. However, the question is whether an improvement
in the $\chi^2$ is worth the increase in the number
of parameters.

Model selection is carried out by comparing evidences,
i.e., computing the integral of the posterior over all model
parameters. Therefore, the specific values of the parameters
are irrelevant. However, for the sake of completeness,
we display in Tab. 8 of Appendix B the maximum
a-posteriori values of some parameters for all atmospheric
models considered and for all observed Stokes
profiles. Concerning the magnetic flux density, we see
that the inferred value is quite robust to the specific
model, except for the umbral profile (in which this quantity
is not related to the amplitude of the Stokes V profile)
and some IMaX profiles (for which the information
is scarce). This is a consequence of the fact that, if the
magnetic field strength is not too strong, the magnetic
flux density is almost an observable. On the contrary,
there is a large variability in the inferred value of the
filling factor, something that directly influences the field
strength and the inclination. We conclude that the inferred
MAP values depend on the selected model and
that model selection turns out to be important.

5.2. Model comparison 

Once the Bayesian evidence is computed for all models
considered, model comparison is just a matter of
comparing real numbers and deciding on the most probable
model following Table 1. Because the evidence can take
very small or very large values, we report the value
of $\ln p(D|M_i)$ in Table 4. We indicate in bold red the
model with the largest evidence and in italic red those
models that are in competition with the selected model
according to the Jeffreys' scale. Given that we cannot
discard the presence of more models compatible with
the data, the evidence cannot be considered as an absolute
scale. Consequently, we can only compute evidence
ratios assuming the same a-priori probability for every
model. For this reason, it is also illustrative to consider
model comparison in a league framework. This is shown
in Fig. 2, where each square indicates the value of the
logarithmic evidence ratio obtained from the competition
of pairs of models (model in the vertical axis versus
model in the horizontal axis). Red colors indicate that
the model in the vertical axis is preferred with respect to
the model in the horizontal axis. Blue colors are used
when the model in the vertical axis is the least preferred
one. Light red (blue) colors are associated with the second
most (least) probable model. Obviously, only half of
the squares contain relevant information, with the other
half showing redundant data because the table is antisymmetric
with respect to the diagonal (a sign change in the
logarithm, equivalent to an inverse in a linear scale for the
evidence ratio).

The first thing to note is that there is not a single
model that can be considered the "best" for all profiles.
A Milne-Eddington model with two magnetic components
seems to be the most probable for classes 2, 17
and 34. According to Viticchié et al. (2011), they only
represent ~9% of the field-of-view. Classes 0, 1 and 4 favor
models with gradients along the LOS in velocity and
magnetic field, while classes 9, 11 and 25 favor models
without gradients in velocity and magnetic field. A simple
weak-field approximation is the model of choice for
the majority of IMaX data. Moreover, there is not a single
"worst" model, although ME1+1 can be considered
not appropriate among the selected models, in general,
for explaining our quiet Sun Hinode profiles. This might
sound strange given that many spectropolarimetric inversions
of Hinode data have been carried out with ME1+1
models. Our results demonstrate that this model is often
not preferred by the data while the opposite occurs for the
ME2 model. The reason is to be found in the presence
of asymmetries that a ME1+1 model cannot fit. Obviously,
this conclusion is based on the limited number
of models that we have considered. If one proposes a
different model (it makes no difference whether this is a
different atmospheric model or one from our selection but
with some parameters fixed), it is a matter of computing
the evidence to find out if it is preferred by the data.

Another property of our model comparison is that,
normally, one model is orders of magnitude more
probable than the rest. Only a few models are really in
a position to compete. For instance, for class 9, we
see that NOGR1+1 has a log-evidence ~3.4 larger than
WEAKF. According to Table 1, there is moderate evidence
that NOGR1+1 is preferable to WEAKF. On the
contrary, for class 2, we can safely say that there is not a
clear preference for ME2 or WEAKF (the difference in
log-evidence is smaller than 1), with the two models being
much more probable than the rest. Other examples
can be found by looking at the comparisons of Fig. 2.
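The two comparisons quoted above can be reproduced directly from the tabulated log-evidences. The sketch below hard-codes the Class 2 and Class 9 rows of the log-evidence table. The thresholds in `jeffreys_label` are an assumption (a commonly used version of the Jeffreys' scale in terms of the difference of log-evidences); they are chosen to be consistent with the statements in the text, but the scale actually adopted in Table 1 should be checked before reuse. All names are hypothetical.

```python
MODELS = ("WEAKF", "ME1+1", "ME2", "NOGR1+1", "NOGR2", "LINGR1+1", "LINGR2")

# Log-evidences ln p(D|M_i) copied from the table for two quiet Sun classes.
LOG_EVIDENCE = {
    "Class 2": (2012.98, 1935.33, 2013.65, 1988.22, 1985.49, 1948.34, 1958.16),
    "Class 9": (2020.51, 1931.57, 2005.87, 2023.87, 2011.58, 1968.85, 1968.04),
}


def jeffreys_label(delta_ln_evidence):
    """Qualitative strength of evidence (assumed thresholds, see lead-in)."""
    a = abs(delta_ln_evidence)
    if a < 1.0:
        return "inconclusive"
    if a < 2.5:
        return "weak"
    if a < 5.0:
        return "moderate"
    return "strong"


def compare(profile, model_a, model_b):
    """Return ln[p(D|A)/p(D|B)] and its label, assuming equal model priors."""
    row = dict(zip(MODELS, LOG_EVIDENCE[profile]))
    delta = row[model_a] - row[model_b]
    return delta, jeffreys_label(delta)


def best_model(profile):
    """Model with the largest log-evidence for the given profile."""
    row = dict(zip(MODELS, LOG_EVIDENCE[profile]))
    return max(row, key=row.get)


if __name__ == "__main__":
    print(best_model("Class 9"), compare("Class 9", "NOGR1+1", "WEAKF"))
    print(best_model("Class 2"), compare("Class 2", "ME2", "WEAKF"))
```

Because only differences of log-evidences enter, the same functions can be used to fill a full pairwise comparison table for any profile.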

Among the quiet Sun profiles, only for classes 0, 1 and
4 does the analysis favor models with linear gradients in the
field strength and LOS velocity, with class 4 preferring
a model with two magnetic components. Classes 0 and
4 have been associated by Viticchié et al. (2011) with the
network and they present large amplitudes of circular
and linear polarization (class 0 reaching 2.55% and
0.32% in Stokes V and L, respectively, and class 4 reaching
13.2% and 1.30% in Stokes V and L, respectively). It
is apparent from these results that large SNRs are
important to favor (or at least allow) more complicated
models, especially when the Stokes profiles are inherently
complex (asymmetries, several components clearly visible).
It is important to point out that a higher SNR is
responsible for the need of increased complexity just
because it does not suppress the important details of the
Stokes profiles that arise in complex atmospheres. For
example, classes 0 and 1 favor LINGR1+1 because the
SNR is relatively high and the asymmetry of the Stokes
V profiles is clearly visible. Consequently, an increase in
the complexity of the model is compensated by the increase
in the quality of the fit. On the other side we find,
for example, class 2, with small asymmetries and favoring
simple models. Additionally, large SNRs in Stokes Q
and U are also crucial to favor more complicated models.
In our case, class 4 has the largest linear polarization
signal in the whole set and thus favors complex models
even though the asymmetries of Stokes V are small,
while classes 0 and 1 are among the ones with the smallest
amplitude of linear polarization.

In summary, if the SNR is large, it is possible to increase
the number of model parameters if we are able to
fit features in the profile that are well above the noise
level. This is also consistent with the fact that the MAP
fit gives the smallest $\chi^2$ for classes 0, 1 and 4. Note that,
even if the $\chi^2$ of class 0 under the LINGR1+1 and LINGR2
models is very similar, LINGR1+1 is preferred because
of the smaller number of free parameters.

A ME2 model is preferred for all profiles belonging to
classes 2, 17 and 34. In general, these classes have linear
polarization profiles at or below the noise level (except
class 34, which presents clear Stokes Q profiles). Of interest
is also the fact that classes 2 and 9, which have the
WEAKF model in competition with more complex models,
present no detection of linear polarization, while the
circular polarization profiles are nearly antisymmetric.
The model comparison suggests that there is limited
information in such profiles and that a WEAKF model
can do a good job of extracting all the available information.
A different problem is, obviously, how to interpret this
information.

A particularly interesting case is the profile of class
34. The Stokes V signal is clearly Q-like as described
by Viticchié et al. (2011), giving the idea of several magnetic
components with opposite polarities inside the resolution
element. The Bayesian analysis suggests that a
ME2 model is enough. This seems reasonable, although
it is clear that the best fit is not able to correctly fit the
two polarities in Stokes V. The reason is that the amplitude
of Stokes Q is even larger than that of Stokes V
and there is not a clear hint of such two components in
linear polarization.

Concerning the Hinode profiles observed in sunspots,
Table 4 points out that a ME2 model is the most suitable
one. This model already produces a very good fit
for the two profiles. For the umbra, this preference for
ME2 is also understood in the light of the enhanced noise
due to the reduced number of photons or the presence of
molecular lines not accounted for in the inversion.

Fig. 2. Logarithmic evidence ratio of each model with respect to every other model. Models can be compared using these tables
if we assume the same a-priori probability for all of them. Each square reports the log evidence ratio between a given model in the vertical
axis versus a certain model in the horizontal axis. Red (and light red) indicate the most probable (and second most probable) model in
each column, while blue (and light blue) show the least probable (and second least probable) model in each column. Note that these tables
are antisymmetric with respect to the diagonal.

The comparison of models for explaining IMaX profiles
is also very illuminating. The reduced number of
data points strongly suggests that the simple weak-field
approximation is the model of choice for explaining the
observations among the ones considered in this work.
This is especially relevant for the profiles in which either
circular or linear polarization (or both simultaneously)
is small. This experiment with IMaX data (with the
preference for very simple models) shows that complex
models with a relatively large number of free parameters
are only favored when the number of sampling points in
wavelength is large, at least larger than the 25 points
of IMaX data. If the sampling is sufficient, the information
encoded in the Stokes profiles about the model
parameters compensates for the increase in the prior volume
of complex models. In case both polarizations are
clearly above the noise level, model ME1+1 is preferred,
clearly suggesting that some information about the substructure
in the pixel can be extracted from the observations
(apart from the magnetic flux density that can
be obtained from the WEAKF approximation), which
might be surprising for such a poor spectral sampling.
Note that the weak-field approximation allows one to extract
very simple quantities from the observables, without
allowing for an explicit substructure inside the pixel.
We point out that it might be possible to find more complex
models but with a reduced number of parameters
(smaller than ME1+1) that would be preferred by the
data. The only way to decide on this is to calculate the
value of the evidence and compute the evidence ratio.

5.3. Simpler proxies 

Given that calculating a reliable estimate of the evidence
is computationally very demanding, it is of interest
to compare it with simpler proxies used for model comparison.
The advantage of such proxies is that they can
be calculated very fast, without performing
the multidimensional integral of the evidence. Two
simple, routinely used proxies are the Bayesian Information
Criterion (BIC; Schwarz 1978) and the Akaike Information
Criterion (AIC; Akaike 1974). Both methods,
which are based on the crude approximation of Gaussianity
of the posterior with respect to the model parameters,
are extremely simple to calculate:

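\mathrm{BIC} = -2 \ln \mathcal{L}_{\max} + k \ln N    (15)
\mathrm{AIC} = -2 \ln \mathcal{L}_{\max} + 2k,    (16)

where $k$ is the number of free parameters of the model,
$\mathcal{L}_{\max}$ is the peak value of the likelihood (at the
least-squares solution if flat priors are used) and $N$ is the
number of observed points. In the case of a Gaussian
likelihood, they transform to:

\mathrm{BIC} = \chi^2_{\min} + k \ln N
\mathrm{AIC} = \chi^2_{\min} + 2k,    (17)

which can be readily calculated for standard inversion
methods based on a least-squares minimization, using
the set of parameters $\hat{\theta}$ that minimize the $\chi^2$:

\chi^2_{\min} = \sum_{i=1}^{N} \left( \frac{y_i - f_i(\hat{\theta})}{\sigma_i} \right)^2.    (18)

Here $y_i$ denotes the observed Stokes signals, $f_i(\hat{\theta})$ the
corresponding model prediction and $\sigma_i$ the noise standard deviation.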

One of the fundamental problems of these criteria (apart
from the assumption of Gaussianity of the posterior) is
that they penalize all parameters equally, not taking into
account situations in which the data do not constrain some
parameters.
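Since both criteria only need the minimum $\chi^2$, the number of free parameters and the number of data points, they can be attached to any least-squares inversion with a couple of lines. The sketch below implements the Gaussian-likelihood form of the two criteria. The numerical values in the usage example are invented for illustration (they are not taken from the tables) and the function names are hypothetical.

```python
import math


def bic(chi2_min, n_params, n_points):
    """Bayesian Information Criterion for a Gaussian likelihood."""
    return chi2_min + n_params * math.log(n_points)


def aic(chi2_min, n_params):
    """Akaike Information Criterion for a Gaussian likelihood."""
    return chi2_min + 2.0 * n_params


def preferred(values):
    """Name of the model with the smallest criterion value."""
    return min(values, key=values.get)


if __name__ == "__main__":
    # Hypothetical fits of two models to the same N = 448 data points.
    n_points = 448
    fits = {"simple": (120.0, 16), "complex": (110.0, 19)}  # (chi2_min, k)
    bics = {m: bic(c, k, n_points) for m, (c, k) in fits.items()}
    aics = {m: aic(c, k) for m, (c, k) in fits.items()}
    print("BIC:", bics, "->", preferred(bics))
    print("AIC:", aics, "->", preferred(aics))
```

In this made-up case the two criteria disagree: the BIC penalty per parameter, $\ln N$, is larger than the AIC penalty of 2 whenever $N > e^2 \approx 7.4$, so the BIC tends to favor the simpler model for the same improvement in $\chi^2$.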

The computed values of the BIC are shown in Table
5. The model with the smallest value of the BIC is the
preferred one, contrary to what happens with the evidence.
This model is indicated in bold red when it gives
the same result as the Bayesian evidence and in blue
when it gives a different result. When comparing two
models, we have verified that more than 80% of the time
the BIC picks the same model selected by the evidence
ratio when dealing with Hinode observations, while this
value increases to ~90% when focusing on IMaX profiles.
The success rate for selecting the best model using
the BIC (as compared with the fully Bayesian case) goes
down to 73%. We consider this an indication that the
BIC is a very good proxy for the Bayesian evidence. The
AIC performs similarly, with the BIC being slightly more
robust. For instance, the LINGR2 model is preferred by
all model comparison methods for the profile associated
with Class 4. On the contrary, the weak-field approximation
is preferred for Class 9 according to the evidence
ratio, while the more complex NOGR2 model is the one
of choice according to BIC. Note that, according to the
evidence ratio, this is the next preferred model in the
hierarchy.

In any case, given the relatively large success rate of
the BIC for the comparison of two models, we suggest that
anyone carrying out standard inversions compute the value of
the BIC for the selected model. This facilitates model
comparison in the future and is able to select the more
probable model in a comparison of two with ~80% confidence
if our results are used as a calibration. Another
application of interest of the proxy is to estimate the
minimum number of wavelength points needed to sample
the Stokes profiles when observing an unresolved magnetic
structure. If one confronts a ME model with one
magnetic component and a ME1+1 model to obtain information
about the filling factor, the ME1+1 model will
be preferred when BIC(ME1+1) is smaller than BIC(ME),
since the model with the smallest BIC is the preferred one.
With the estimated values of the $\chi^2$, one can infer the
number of wavelength points needed to prefer ME1+1.

6. CONCLUSIONS 

We have presented the first quantitative Bayesian comparison
of models used for the interpretation of observed
Stokes profiles. Our results suggest that there is not
a single model that is suitable for explaining different
Stokes profiles in the quiet Sun and in active regions. In
essence, the selected model in each case depends on the
amount of information encoded in the observations. Simpler
models are preferred when the SNR is low or when
the spectral sampling is poor because this information
is diluted by the noise. Even if the underlying physics
is very complex and produces inherently very complicated
Stokes profiles, the presence of noise destroys
the information and a simple model is enough to explain
them. More complex models are favored when the SNR
is large (especially if this is the case for the four Stokes
parameters simultaneously) because minute variations of
the shape of the Stokes parameters have to be fitted. We
stress again that a higher SNR is responsible for
the need of increased complexity just because it does not
suppress the important details of the Stokes profiles that
arise in complex atmospheres (asymmetries, several components,
etc.). We therefore conclude that the complexity
of the observed Stokes profiles is the main driver for favoring
more elaborate models.

Given that the Bayesian evidence is computationally
heavy, we suggest providing the BIC as a trivial output of
any inversion code. This will facilitate a more quantitative
model comparison in the future, if used as a proxy
of the Bayesian evidence. Additionally, the good behavior
of the BIC can be used to develop a metainversion
scheme in which any of Eqs. (17) is minimized by modifying
the values of the free parameters of the models
and the models themselves. The result would be a good
approximation to the best model that is allowed by the
data.
APPENDIX

For the sake of clarity, we write the explicit expressions we have used in each model for the computation of the 
Stokes profiles. 



Weak-field approximation 

In the weak-field approximation, these are the relations between Stokes Q, U and V and Stokes I
(Landi Degl'Innocenti & Landolfi 2004):

V(\lambda) = -4.67 \times 10^{-13} \, \bar{g} \, \lambda^2 B_\parallel \frac{\partial I(\lambda)}{\partial \lambda}

Q(\lambda) = -5.45 \times 10^{-26} \, \bar{G} \, \lambda^4 B_\perp^2 \cos 2\chi \, \frac{\partial^2 I(\lambda)}{\partial \lambda^2}

U(\lambda) = -5.45 \times 10^{-26} \, \bar{G} \, \lambda^4 B_\perp^2 \sin 2\chi \, \frac{\partial^2 I(\lambda)}{\partial \lambda^2},    (A1)

with the wavelength $\lambda$ in Å and the components of the magnetic field vector measured in G. The factor $\bar{g}$ is the effective
Landé factor and $\bar{G}$ is the equivalent for linear polarization (e.g., Landi Degl'Innocenti & Landolfi 2004). The relations
at second order for Stokes Q and U are only valid for non-saturated lines. In our modified weak-field approximation,
Stokes I is given by Eq. (14) and its derivatives can be computed analytically.
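The weak-field relations can be evaluated numerically for any sampled intensity profile. The sketch below is only an illustration: it uses finite differences (`numpy.gradient`) instead of the analytical derivatives mentioned in the text, a Gaussian absorption line as a stand-in for the modified Stokes I model, and hypothetical field values. The Landé factors are those listed for the 6302.494 Å line in the table of atomic parameters.

```python
import numpy as np


def weak_field_stokes(wavelength, stokes_i, b_par, b_perp, chi, g_eff, g_lin):
    """Weak-field Stokes Q, U and V from a sampled Stokes I profile.

    wavelength in Angstrom, fields in G, azimuth chi in radians.
    Derivatives are approximated with finite differences.
    """
    d_i = np.gradient(stokes_i, wavelength)       # dI/dlambda
    d2_i = np.gradient(d_i, wavelength)           # d^2I/dlambda^2
    v = -4.67e-13 * g_eff * wavelength ** 2 * b_par * d_i
    lin = -5.45e-26 * g_lin * wavelength ** 4 * b_perp ** 2 * d2_i
    q = lin * np.cos(2.0 * chi)
    u = lin * np.sin(2.0 * chi)
    return q, u, v


if __name__ == "__main__":
    lambda0 = 6302.494
    wavelength = lambda0 + np.linspace(-0.5, 0.5, 201)
    # Placeholder intensity profile: Gaussian absorption line.
    stokes_i = 1.0 - 0.5 * np.exp(-((wavelength - lambda0) / 0.05) ** 2)
    q, u, v = weak_field_stokes(wavelength, stokes_i, b_par=100.0,
                                b_perp=200.0, chi=0.0,
                                g_eff=2.5, g_lin=6.25)
    print("max |V| =", np.abs(v).max(), " max |Q| =", np.abs(q).max())
```

For a positive longitudinal field and an absorption line, the blue lobe of Stokes V comes out positive and the red lobe negative, which is a quick sanity check on the sign convention of Eq. (A1).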



Milne-Eddington solution

In the Milne-Eddington approximation, the emergent Stokes profiles normalized to the continuum intensity are given
by Eqs. (9.110) of Landi Degl'Innocenti & Landolfi (2004), that we rewrite here:

\frac{I(\mu)}{I_c} = \frac{1}{1+\beta\mu} \left\{ 1 + \beta\mu \, \Delta^{-1} (1+k_I) \left[ (1+k_I)^2 + f_Q^2 + f_U^2 + f_V^2 \right] \right\}

\frac{Q(\mu)}{I_c} = -\frac{\beta\mu}{1+\beta\mu} \, \Delta^{-1} \left\{ (1+k_I)^2 k_Q - (1+k_I)(k_U f_V - k_V f_U) + f_Q (k_Q f_Q + k_U f_U + k_V f_V) \right\}

\frac{U(\mu)}{I_c} = -\frac{\beta\mu}{1+\beta\mu} \, \Delta^{-1} \left\{ (1+k_I)^2 k_U - (1+k_I)(k_V f_Q - k_Q f_V) + f_U (k_Q f_Q + k_U f_U + k_V f_V) \right\}

\frac{V(\mu)}{I_c} = -\frac{\beta\mu}{1+\beta\mu} \, \Delta^{-1} \left\{ (1+k_I)^2 k_V - (1+k_I)(k_Q f_U - k_U f_Q) + f_V (k_Q f_Q + k_U f_U + k_V f_V) \right\},    (A2)

where

\Delta = (1+k_I)^4 + (1+k_I)^2 \left( f_Q^2 + f_U^2 + f_V^2 - k_Q^2 - k_U^2 - k_V^2 \right) - \left( k_Q f_Q + k_U f_U + k_V f_V \right)^2.    (A3)

The $k_i$ with $i = \{I, Q, U, V\}$ and $f_i$ with $i = \{Q, U, V\}$ are the elements of the propagation matrix, as defined in Eq.
(9.39) of Landi Degl'Innocenti & Landolfi (2004).
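A direct transcription of Eqs. (A2) and (A3) into code is a convenient way to check limiting cases. The sketch below takes the propagation matrix elements as inputs (their computation from the Voigt and Faraday-Voigt profiles is not shown) and treats $\beta$ as the free parameter that appears in the combination $\beta\mu$. The function name and the test values are hypothetical.

```python
import numpy as np


def milne_eddington_stokes(k_i, k_q, k_u, k_v, f_q, f_u, f_v, beta, mu=1.0):
    """Emergent Stokes parameters normalized to the continuum, Eqs. (A2)-(A3).

    The inputs are the propagation matrix elements (scalars or arrays over
    wavelength); beta and mu enter only through the product beta * mu.
    """
    h = 1.0 + k_i
    kf = k_q * f_q + k_u * f_u + k_v * f_v
    f2 = f_q ** 2 + f_u ** 2 + f_v ** 2
    k2 = k_q ** 2 + k_u ** 2 + k_v ** 2
    delta = h ** 4 + h ** 2 * (f2 - k2) - kf ** 2            # Eq. (A3)
    bm = beta * mu
    pref = bm / (1.0 + bm) / delta
    stokes_i = (1.0 + bm * h * (h ** 2 + f2) / delta) / (1.0 + bm)
    stokes_q = -pref * (h ** 2 * k_q - h * (k_u * f_v - k_v * f_u) + f_q * kf)
    stokes_u = -pref * (h ** 2 * k_u - h * (k_v * f_q - k_q * f_v) + f_u * kf)
    stokes_v = -pref * (h ** 2 * k_v - h * (k_q * f_u - k_u * f_q) + f_v * kf)
    return stokes_i, stokes_q, stokes_u, stokes_v


if __name__ == "__main__":
    # Continuum (no line absorption): the normalized intensity must be 1.
    print(milne_eddington_stokes(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, beta=3.0))
    # Unpolarized line: I/I_c = [1 + beta*mu/(1 + k_I)] / (1 + beta*mu).
    print(milne_eddington_stokes(2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, beta=3.0))
```

In the absence of polarized absorption and magneto-optical terms the expression reduces to the familiar unpolarized Milne-Eddington line profile, and in the continuum it returns unity, as expected from the normalization.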
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MAXIMUM A-POSTERIORI VALUES 

Table 8 shows, for all models except the weak-field approximation, the maximum a-posteriori values of the field
strength ($B$), the field inclination ($\theta_B$), the filling factor ($f$) and the magnetic flux density ($\phi$). In these models, the magnetic
flux density is computed as $\phi = f B_1 \cos([\theta_B]_1) + (1-f) B_2 \cos([\theta_B]_2)$. Given the special nature of the weak-field
model, we only tabulate the magnetic flux density, which is given by $B_\parallel$ since there is only one component. Since we
use flat priors, the tabulated values coincide with those one would obtain from a standard least-squares fit of the
observed profiles. For the models with two magnetic components, we show the values of $B$ and $\theta_B$ in both components.
For the models with gradients along the line-of-sight, we tabulate the value at $\log \tau_{5000} = -2$, as representative of
the conditions in the line formation region. Error bars are not shown to avoid crowding.
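
The two-component expression for the magnetic flux density quoted above is simple enough to write down directly. The sketch below is only an illustration: the function name `flux_density` is hypothetical and the input values are invented for the example, not taken from Table 8.

```python
import numpy as np

def flux_density(f, b1, theta1_deg, b2, theta2_deg):
    """phi = f*B1*cos(theta1) + (1-f)*B2*cos(theta2); fields in G, angles in degrees."""
    return (f * b1 * np.cos(np.radians(theta1_deg))
            + (1.0 - f) * b2 * np.cos(np.radians(theta2_deg)))

# Illustrative values: two components of opposite polarity partially cancel.
phi = flux_density(f=0.5, b1=1000.0, theta1_deg=0.0, b2=500.0, theta2_deg=180.0)
```

The example shows why the flux density can be much smaller than either field strength when the two components have opposite polarities.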
TABLE 8

Maximum a-posteriori values for some parameters. 
              999.86   1298.82   2532.40   2279.34   2899.99   1703.17   2684.40
IMaX1          -6.11    -11.44     -0.18      4.64     68.47   -251.23  -1034.43
IMaX2          43.97     51.38    -31.95    135.27      7.88    465.60     15.86
IMaX3         -18.86    -14.42    -77.15   -281.55   -748.89   -376.38   -514.77
IMaX4          -1.39      1.42     -1.99    -34.24    125.99      0.21   1020.42

$B$ [G]
Class      1285.24    104.79/234.61    228.00     247.00/82.20    367.00     357.00/47.90
Class 1     114.62      59.03/81.74     81.10      52.00/96.70    143.00     68.10/229.00
Class 2      49.73      58.35/54.31     58.20      52.20/48.70    103.00     36.40/178.00
Class 4     932.46    617.19/864.72    977.00    538.00/819.00    814.00    844.00/567.00
Class 9      29.80      41.18/42.59     58.60      79.00/44.30    102.00     106.00/29.90
Class 11    218.74     78.82/179.61    207.00    167.00/161.00    395.00     155.00/69.30
Class 17    223.00      93.53/52.60    157.00     41.60/151.00    248.00     134.00/57.30
Class 25     15.98      58.29/48.57     95.90      82.00/54.20    191.00     213.00/57.00
Class 34    218.70    206.35/198.37    298.00    126.00/337.00    488.00    545.00/136.00
Penumbra   1337.04  1299.85/1337.64   1320.00   1480.00/930.00   1430.00  1400.00/1270.00
Umbra      2927.44  2888.75/2253.19   3030.00  2630.00/3810.00   2480.00  3800.00/2110.00
IMaX1      1369.61      59.87/97.49   2650.00    1870.00/62.70   2230.00    2170.00/82.90
IMaX2       963.44     166.10/34.36    381.00     259.00/41.20    798.00     43.00/177.00
IMaX3       435.38     140.52/99.94   1530.00    84.60/3420.00   1530.00    2260.00/88.70
IMaX4       398.50       1.93/51.77    572.00    13.30/3090.00   1890.00   102.00/3260.00

$\theta_B$ [deg]
Class       178.52    117.86/137.50    149.30    158.60/131.60    156.40    139.50/130.70
Class 1      64.05     103.24/63.14     40.70     61.50/169.80     40.30      54.10/22.80
Class 2      54.85      69.66/87.38     52.00      65.90/72.50     40.40      67.80/92.90
Class 4      24.34      25.29/25.75     15.00      43.40/17.80     23.80      23.60/23.10
Class 9       3.57      64.72/46.25     48.30      71.20/74.90     20.00      75.20/63.60
Class 11     75.29      51.24/79.07     71.60      80.20/69.10     66.20      80.00/58.10
Class 17     78.38      79.37/59.38     72.90      76.20/75.10     70.10      75.50/77.50
Class 25     88.13     98.26/125.90    104.90     84.90/102.10    104.30     86.00/104.20
Class 34     77.46      77.68/89.36     84.10      87.60/87.10     82.20      93.90/87.10
Penumbra     51.47      53.44/64.39     53.00      49.80/59.30     50.10      50.50/59.60
Umbra        17.80      21.52/21.43     15.90      26.20/27.20     16.80      27.70/26.50
IMaX1       100.67      93.36/28.90     89.10      79.60/96.90    109.70     186.10/94.70
IMaX2        13.36    137.65/128.97     47.20     112.50/36.90     19.80     31.50/111.70
IMaX3       103.04    137.34/138.23    111.70     97.00/152.30    121.40     143.80/96.40
IMaX4        75.73    140.44/126.82     96.70      89.10/40.50     89.20      92.10/34.00

$f$
Class         0.05    0.85    0.62    0.52    0.59    0.51
Class 1       0.62    0.20    0.66    0.47    0.60    0.50
Class 2       0.49    0.81    0.63    0.60    0.57    0.56
Class 4       0.86    0.32    0.53    0.42    0.69    0.32
Class 9       0.13    0.84    0.63    0.58    0.58    0.53
Class 11      0.46    0.19    0.68    0.35    0.61    0.43
Class 17      0.35    0.80    0.52    0.39    0.54    0.54
Class 25      0.21    0.63    0.42    0.50    0.54    0.41
Class 34      0.17    0.17    0.47    0.54    0.48    0.45
Penumbra      0.87    0.72    0.68    0.64    0.91    0.72
Umbra         0.47    0.74    0.78    0.47    0.72    0.54
IMaX1         0.05    0.96    0.11    0.22    0.33    0.48
IMaX2         0.05    0.10    0.52    0.19    0.62    0.80
IMaX3         0.15    0.09    0.50    0.76    0.47    0.28
IMaX4         0.01    0.98    0.51    0.95    0.01    0.62